Training YOLOv3 Models for Object Detection
The YOLOv3 model, which uses pre-trained weights for standard object detection problems, is accessible from the menu item Artificial Intelligence > Custom Deep Model Architecture > YOLOv3. This feature simplifies the training process and lets you apply trained models to selected datasets. Applying a YOLOv3 object detection model produces a series of 2D boxes around the objects of interest, as shown below.
From left to right: kidneys, yeast cells, Drosophila brain
The following publications provide additional information about YOLOv3.
- You Only Look Once: Unified, Real-Time Object Detection (https://arxiv.org/abs/1506.02640)
- YOLOv3: An Incremental Improvement (https://arxiv.org/pdf/1804.02767.pdf)
The following items are required for training YOLOv3 for object detection:
- Training dataset(s) for the input. See Cropping Datasets for information about extracting a training dataset as a subset of the original data.
- A target for the output, which must be a multi-ROI (see Labeling Multi-ROIs for Object Detection).
The following items are optional for training YOLOv3:
- One or more ROI masks for defining the working space of the model. For example, if you want to exclude some labeled areas from training, you need a mask.
Note Refer to the topic Creating Mask ROIs for information about creating mask ROIs.
Multi-ROIs that are used as the target output for object detection must meet the following conditions:
- The multi-ROI must have the same geometry as the input training data.
- All voxels within an input patch must be labeled for the patch to be considered during training, and each instance of an object must be labeled. Patches that are not fully segmented are ignored.
Note Applying a mask may limit the number of input patches that are processed.
- Label 1 must contain a fully labeled background, while the other labels must contain the object classes to detect. For example, the multi-ROI for a kidney detection model may contain the classes Background (label 1), Left Kidney (label 2), and Right Kidney (label 3).
Note A model with 'n' classes requires a multi-ROI with n+1 labels.
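The following Python sketch shows one way you might verify these conditions before training. It assumes the multi-ROI is available as a NumPy label array in which 0 means unlabeled, 1 is the background, and 2 through n+1 are the object classes; the check_multi_roi helper is hypothetical and not part of Dragonfly.

```python
# Hypothetical helper (not Dragonfly API): sanity-check a multi-ROI label
# array against the conditions above. Assumes 0 = unlabeled, 1 = background,
# and 2..n+1 = object classes.
import numpy as np

def check_multi_roi(labels: np.ndarray, class_count: int) -> None:
    # An 'n'-class model needs n+1 labels: background plus one per class.
    expected = set(range(1, class_count + 2))
    missing = expected - set(np.unique(labels).tolist())
    if missing:
        print(f"Missing labels: {sorted(missing)}")
    # Voxels left at 0 are unlabeled; patches containing them are ignored.
    unlabeled = int((labels == 0).sum())
    print(f"Unlabeled voxels: {unlabeled}")

# Example: a 2-class kidney model expects labels {1, 2, 3}.
demo = np.ones((64, 64), dtype=np.uint8)   # fully labeled background
demo[10:20, 10:20] = 2                     # 'Left Kidney'
demo[40:50, 40:50] = 3                     # 'Right Kidney'
check_multi_roi(demo, class_count=2)       # no missing labels, 0 unlabeled
```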
Object detection models require boxes as input. Dragonfly automatically converts each multi-ROI island into a box internally, so both of the segmentations shown below are valid training outputs. Note that in both cases the background is fully labeled.
Segmentation options for object detection
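The island-to-box conversion can be pictured with a short Python sketch. This is only a conceptual illustration using scipy.ndimage, not Dragonfly's internal implementation: each connected component of a class label is reduced to the smallest axis-aligned box that encloses it, which is why a rough blob and a hand-drawn rectangle around the same object are equally valid.

```python
# Conceptual sketch (not Dragonfly's internal code): reduce each labeled
# island of a class to the smallest 2D box that encloses it.
import numpy as np
from scipy import ndimage

def islands_to_boxes(labels: np.ndarray, class_label: int):
    """Return (row_min, col_min, row_max, col_max) for each island."""
    islands, _ = ndimage.label(labels == class_label)  # connected components
    return [(sl[0].start, sl[1].start, sl[0].stop, sl[1].stop)
            for sl in ndimage.find_objects(islands)]

demo = np.ones((100, 100), dtype=np.uint8)  # label 1 = fully labeled background
demo[30:60, 20:45] = 2                      # one island of class label 2
print(islands_to_boxes(demo, class_label=2))  # [(30, 20, 60, 45)]
```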
Patch size, class count, and input count can be selected in the Model Information dialog whenever you generate a new YOLOv3 model.
Click the New button on the Train tab to open the dialog, shown below.
Model Information dialog
| Option | Description |
|---|---|
| Patch size | During training, the training data is split into smaller 2D patches, whose size is defined by the 'Patch size' parameter. For example, if you choose a patch size of 256, the dataset is cut into sub-sections of 256 x 256 pixels, and these sub-sections are then used for training. Subdividing images makes each pass, or 'epoch', faster and less memory intensive. However, patches should be large enough to fully enclose the object(s) to be detected. Note Smaller patch sizes generally yield more truncated boxes resulting from boundary artifacts. Use the biggest possible patch size for the best results (see the sketch below this table). |
| Class count | The number of object classes to detect. For example, a kidney detection model usually has two classes, 'Left Kidney' and 'Right Kidney'. Note The entered Class count should not include the background labeled in the multi-ROI. A background class is added automatically. For example, if you choose '4' as the class count, then the classes that appear on the Train tab will be: Background, Class 1, Class 2, Class 3, and Class 4. |
| Input count | The number of channels in the input dataset: '1' for grayscale images and '3' for color images with red, green, and blue channels. |
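The patch-splitting behavior described above can be illustrated with the following minimal sketch. It assumes simple non-overlapping tiling of a 2D slice; the exact tiling Dragonfly uses (overlap, handling of edge tiles, and so on) may differ.

```python
# Minimal sketch of tiling a 2D slice into square training patches.
# Non-overlapping tiling is assumed; edge remainders are simply dropped here.
import numpy as np

def split_into_patches(image: np.ndarray, patch_size: int = 256):
    rows, cols = image.shape[:2]
    return [image[r:r + patch_size, c:c + patch_size]
            for r in range(0, rows - patch_size + 1, patch_size)
            for c in range(0, cols - patch_size + 1, patch_size)]

# A 1024 x 1024 slice yields 16 non-overlapping 256 x 256 patches.
slice_2d = np.zeros((1024, 1024), dtype=np.float32)
print(len(split_into_patches(slice_2d)))  # 16
```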
To generate a new YOLOv3 model, do the following:
- Choose Artificial Intelligence > Custom Deep Model Architecture > YOLOv3 on the menu bar.
The YOLOv3 dialog appears.
- Click the New button on the Train tab.
The Model Information dialog appears.
- Enter a name and description for the new model, as required.
- Choose the required Patch size in the drop-down menu, as shown below.

Recommendation Smaller patch sizes generally yield more truncated boxes resulting from boundary artifacts. You should use the biggest possible patch size for the best results.
- Choose the required Class count, as shown below.

Note The entered Class count should not include the background labeled in the output multi-ROI. A background class is added automatically. For example, if you choose '4' as the class count, then the classes that appear on the Train tab will be: Background, Class 1, Class 2, Class 3, and Class 4.
- Choose the Input count, as shown below.

Input count is the number of channels in the input dataset: '1' for grayscale and '3' for color images with red, green, and blue channels (see the sketch after this procedure).
- Click OK.
After processing is complete, the model appears in the Model list.
Note A background class is assigned automatically, and the labels in your output multi-ROI must match this layout: a fully labeled background plus one label per class.

- Continue to the topic Training Models for Object Detection to learn how to train your new model.
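If you are unsure which Input count to enter, a quick check of the dataset's array shape settles it. The helper below is illustrative only and assumes the data is available as a NumPy array; it is not part of Dragonfly.

```python
# Illustrative check (assumed helper, not part of Dragonfly): infer the
# input count from the shape of a 2D slice.
import numpy as np

def input_count(slice_2d: np.ndarray) -> int:
    # (rows, cols) -> 1 channel (grayscale); (rows, cols, 3) -> 3 channels (RGB)
    return 1 if slice_2d.ndim == 2 else slice_2d.shape[-1]

print(input_count(np.zeros((512, 512))))     # 1 (grayscale)
print(input_count(np.zeros((512, 512, 3))))  # 3 (RGB)
```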
You can start training a YOLOv3 model for object detection after you have prepared your training input(s) and output(s), as well as any required masks (see Prerequisites).
- Open the YOLOv3 dialog, if it is not already onscreen.
To open the dialog, choose Artificial Intelligence > Custom Deep Model Architecture > YOLOv3 on the menu bar.
- Do one of the following, as required:
- Generate a new model for object detection (see Generating New Models).
- Select an untrained or trained model from the Model list that contains the required number of classes and inputs.
Note In this case, you will need to click the Load button to load the selected model.
- Rename the classes and/or assign new colors to them, optional.

- Do the following in the Inputs box for each set of training data that you want to train the model with:
- Choose your training dataset in the Input drop-down menu.
Note If your model requires multiple inputs, for example when working with color images, select the additional input(s), as required.

- Choose the labeled multi-ROI in the Output drop-down menu.
Note Only multi-ROIs with the number of classes corresponding to the model's class count and with the same geometry as the input dataset will be available in the menu.
- Choose a mask in the Mask drop-down menu, optional.
Note If you are training with multiple training sets, click the Add New button and then choose the required input(s), output, and mask for the additional item(s).
The completed Inputs should look something like this:

- Adjust the Data augmentation settings, as required (see Data Augmentation Settings).
In most cases, the default values should give good results.
- Adjust the Training parameters, as required (see Training Parameters).
Note You should monitor the estimated memory ratio when you choose the training parameter settings. The ratio should not exceed 1.00 (see Estimated memory ratio).
- Click the Train button.
The dataset is validated and then automatically split into training and validation sets before training begins. You can monitor the progress of training in the Training Model dialog, as shown below.
During training, the quantities 'loss' and 'val_loss' should decrease. Continue to train until 'val_loss' stops decreasing (see the sketch after this procedure).
Note You can also click the List tab and then view the training metric values for each epoch.
- Wait for training to complete. You can also stop training at any time, if required.
- Evaluate the results for the training session, recommended (see Evaluating Training Results).
- Generate previews of the test set to evaluate the model, recommended (see Previewing Training Results).
- If the results are not satisfactory, you should consider doing one or more of the following and then retraining the model:
- Add an additional training set.
- Create a mask centered on problematic areas (see Creating Mask ROIs).
- Adjust the data augmentation settings (see Data Augmentation Settings).
- Adjust the training parameter settings (see Training Parameters).
- When the model is trained satisfactorily, click the Save button to save your object detection model.
- Apply the model to the original dataset or to similar datasets (see Applying Object Detection Models), as required.
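The stopping rule described in the training step above can be sketched as follows. This is assumed logic for illustration, not Dragonfly's implementation; the should_stop helper and its patience parameter are hypothetical.

```python
# Sketch of the stopping rule: keep training while 'val_loss' improves,
# and stop once it has not improved for a few epochs ('patience').
def should_stop(val_losses: list[float], patience: int = 3) -> bool:
    if len(val_losses) <= patience:
        return False
    best_earlier = min(val_losses[:-patience])
    # Stop if none of the last `patience` epochs beat the earlier best.
    return min(val_losses[-patience:]) >= best_earlier

history = [1.90, 1.20, 0.85, 0.80, 0.81, 0.82, 0.83]
print(should_stop(history))  # True: val_loss has stopped decreasing
```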
You can preview the results of training on an image slice of a selected dataset with the options in the Preview box, shown below.
Preview
- Show or hide a class by changing its visibility in the Classes box, optional.
- Select the dataset that you want the model to be applied to in the drop-down menu.
- Click the Apply button.
The preview appears in the selected view.
